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ABSTRACT 
The  theory  of  systematic  sampling,  a  possible  competitor 
to  random  sampling  for  problems  of  Monte-Carlo  type,  Is  discussed. 
It  is  based  on  the  asymptotic  distribution,  in  the  unit  cube  in 
a  space  of  many  dimensions,  of  a  sequence  of  points  generated 
by  congruences  (mod  l)  of  the  multiples  of  a  vector  with 
independent  Irrational  components.   Theorems  based  on  algebraic 
number  theory,  and  ones  based  on  a  statistical  treatment, 
concerning  the  rate  of  decrease  of  the  error  with  increasing 
sample  size,  are  given.   The  method  is  illustrated  by  application 
to  a  simple,  multiplicative,  stochastic  problem.   The  error  is 
seen  to  be  substantially  smaller  than  when  the  sequence  of  points 
in  the  unit  cube  is  a  random  one.   This  is  strongly  suggested 
by  the  algebraic  theorems  and  confirmed  by  the  statistical 
theorems;  however,  it  is  likely  that,  in  practical  Monte  Carlo 
problems  of  the  usual  complexity,  the  improvement  over  random 
sampling  will  be  evident  only  for  very  large  sample  sizes. 
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A  NON-RANDOM  SAMPLING  METHOD,  BASED  ON 
CONGRUENCES,  FOR  "MONTE  CARLO"  PROBLEMS 

1.   Introduction  and  Summary. 

During  the  past  ten  years,  several  studies  have  been 
made  of  systematic  (non-random)  sampling  methods,  for 
calculations  of  the  Monte  Carlo  type,  based  on  theorems 
of  H.  Weyl    on  the  asymptotic  distribution,  in  n-space, 
of  a  set  of  points  generated  by  congruences.   The  object 
is  to  find  a  computing  method,  similar  to  Monte  Carlo, 
in  which  the  error  of  the  approximation  decreases  more 
rapidly,  with  Increasing  sample  size,  than  when  random 
sampling  is  used.   Although  the  method  has  not  yet  achieved 
success  in  practical  applications,  there  seems  to  be  enough 
interest  in  it  to  warrant  giving  an  account. 

In  Section  2  it  is  shown  that  when  the  Monte  Carlo 
method  is  applied  to  the  stochastic  problems  of  chain 
reactions,  cosmic  ray  showers,  and  the  like,  the  problem  to 
be  solved  may  be  Interpreted  as  that  of  evaluating  a 
definite  integral 


(1)  I  =  /  ...;  f(x^,...,x^)dx^...dx^ 


1 
0 


over  the  unit  cube  in  a  large  number  of  dimensions.   In 
sampling  methods,  the  integral  is  approximated  by  choosing 
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a  sequence  of  points  {x^^    -* , .  .  .  ,Xj^^  ')    (n=l,2, .  .  .  ,N) , 
In  the  k-dimenslonal  cube,  and  writing 


"       (n)     ,  (n) 


(2)   h'^YZ^W^'---'^ 


as  an  approximation  to  I. 

In  the  standard  Monte  Carlo  method,  these  N  points 
are  chosen  at  rsmdora.  In  the  cube;  the  methods  of  statistics 
can  then  be  used  to  show  that   I„  ->  I,   as  N  -^  co  ,   In  the 
sense  that 

(3)   Expectation  (Ijj)=  <%>  =  ^  * 


(4)    Std.  Dev.    (IJ=  \/<(ljj-l)^>  oc  -^ 


which,  for  comparison  with  the  non-random  sampling  method, 
we  paraphrase  by  writing 

(5)   In  -  I  ^  (^(^)  i 

furthermore,  standard  sampling  theory  can  be  used  to  estimate 
the  error  of  the  sample  mean  Ij^  in  terms  of  the  sample 
variance  (which  is  usually  calculated  concurrently  with   I^j, 
by  the  computing  machine) . 

In  the  systematic  sampling  method,  the  sequence  of 
points  in  the  k-dlmensional  unit  cube  is  chosen  as  follows: 
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Let   ^i»  ^2''°'*  ^k  ^®  ^  linearly  Independent  ,  real, 
irrational  numbers.   When  these  have  been  chosen,  there  are 
no  further  arbitrary  elements  in  the  method  -  the  sequence 
of  points  is  then  fully  determined:  the  coordinates  of 
the  n —  point  are 

(6)  (x^("^..,,  x^("))  =  ({n^^],...,  (nej)  , 

where  -{x  j  denotes  the  fractional  part  of  X. 

Ho  Weyl  showed,  first,  that  the  sequence  (6)  is  asymptoticall 
uniformly  distributed  in  the  unit  cube  and,  second,  that  when  the 
points  in  (2)  are  chosen  in  this  way, 

(7)  ^N  -^  -^  ^^   N  -^  00  , 


provided  that  the  function  f(x,,..o,  x,  )   is  Riemann-integrable 

There  is  considerable  evidence  that  the  fluctuations  of 
the  sequence  (6)  may  be  substantially  less  than  that  of  a 
random  sequence »   Section  3  summarizes  this  evidence,   which 
consists  of  theorems  by  A.  Ostrowski,  by  the  writer,  find  by 
L.  G.  Peck.   These  theorems  indicate  that  for  a  sufficiently 
smooth  integrand  f(x,  ,...,  x,  ) ,   the  approach  of   I^,  to  I 
is  asymptotically  more  rapid  than  indicated  by  (5)* 

*  There  is  no  vanishing  linear  combination  of  them  that  has  all 

rational  coefficients. 
**Weyl's  first  result  means  that   Ijj  — >  I  if  f{xi,...,   x^) 

is  the  characteristic  function  of  any  rectangular  subregion 

of  the  cube. 
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Unfortunately,  the  theorems  referred  to  are  not 
broad  enough  to  be  applicable  to  practical  Monte  Carlo 
problems  because  of  the  restriction  to  k  =  1   In  Ostrowskl's 
theorem  and  to  very  smooth  Integrsmds  In  the  theorems  of 
the  writer  and  of  Peck.   A  new  approach  to  the  problem  Is 
therefore  tsiken.  In  Section  4,  by  reintroducing  statistical 
considerations.   If,  at  the  beginning  of  a  problem,  the 
numbers   ^  , . . , ,  ^,   are  chosen  at  random  from  some  continuous 
distribution,  then,  with  exception  of  choices  having  probability 
zero,  they  will  Indeed  turn  out  to  be  linearly  independent 
irrational  numbers.   We  can  therefore  regard  the   ^.j^,...,  %^ 
as  random  variables  (they  will  be  taken  to  have  a  uniform 
distribution  in  the  unit  cube,  since  there  is  no  loss  of 
generality  in  this  choice);  for  given  N   (and  a  given  function 
f(x,  ,c..,  X,  )),   Ijj  is  then  a  function  of  the  random  variables 
4-1 » • '  •  >  ir.>      and  we  may  discuss  its  statistics. 

Our  main  result  is  that,  with  the  points  in  (2)  chosen 
according  to  (6),  but  with  the  ^-^,,,,,    ^^  now  regarded  as 
random  variables,  we  have 

(8)  Expectation  Ij,  =  <!«>  =  I  > 

(9)  Expectation   |  l^-lj  =<|lj^-ll>=  0(-g^; 

(see  equation  (31)),  provided  only  that  f(x-j^,..,,  x-^)      is 
sufficiently  smooth  that  Its  Fourier  series  is  absolutely 
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convergent.   This  Is  a  sufficiently  great  improvement  over 
straight  random  sampling  that,  if  the  coefficients  implied 
in  (5)  and  (9)  do  not  differ  too  vastly,  the  systematic 
sampling  method  should  be  quite  worthwhile  in  practice. 
The  requirement  of  absolute  convergence  of  the  Fourier 
series  for  f(x, ,.o.,  x,  )   can  often  be  achieved  by  means 
of  sampling  "tricks"  such  as  the  ones  that  have  been  used, 
in  rsmdom  sampling,  to  reduce  the  sample  variance  (this 
is  illustrated  in  the  numerical  examples) . 

Section  5  contains  various  comments  on  the  method; 
and  Section  6  discusses  a  few  niomerical  examples. 

The  Improvement  over  random  sampling  appears  to  be 
rather  small  but  may  be  of  some  slight  practical  Interest. 

2.   The  Monte  Carlo  Method. 

In  the  usual  Monte  Carlo  method  of  calculating  the 
properties  of  a  neutron  chain  reaction,  one  is  concerned 
with  a  large  collection  or  ensemble  of  neutron  chains,  each 
starting  with  a  single  neutron,  and  all  developing  under 
identical  conditions,  determined  by  the  distribution  of 
fissionable  and  scattering  material.   The  chains  differ  from 
each  other,  even  though  they  are  all  started  in  an  identical 
manner,  because  the  occurrence  and  characteristics  of  the 
nuclear  events  are  determined  by  chance.   The  probability 
laws  governing  these  events  are  presvuned  known  and  one  wishes 
to  determine  the  statistical  properties  of  the  ensemble  of  chains, 
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For  this  purpose,  one  constructs  numerical  descriptions, 
mathematically,  of  a  large  number  of  representative  chains, 
by  assigning  arbitrary  values  to  the  various  quantities  that 
determine  the  sequence  of  events  constituting  the  chains. 
The  values  are  assigned  at  random,  with  probabilities 
distributed  in  accordance  with  the  aforementioned  probability 
laws.   Specifically,  the  assignment  is  made  in  the  following 
way.  there  is  supposed  given  an  inexhaustible  supply  of 
"random"  numbers,  1.  e.,  an   infinite  sequence  of  numbers. 


(10) 


>.-,,...,   A.  , 


uniformly  distributed  In  the  interval   (O,  l),   of  which  each  Is 
statistically  Independent  of  the  numbers  preceding  it  in  the 
sequence.   These  numbers  are  used  one  by  one  and  then  discarded. 
When,  In  the  course  of  constructing  the  nuaerlcal  description 
of  one  of  the  chains,  it  becomes  necessary  to  assign  a  value 
to  a  quantity  y  for  which  a  probability  distribution  P(y) 
is  given  (by  this  is  meant;   P(yo)   ^^  ^^^  probability  of 
y  <  Yq     and,  in  particular,   P(-oo  )  =  0,  P(+oo)  =  l),   then 
the  value  assigned  to  y  Is  y^  given  by  the  equation; 
P(yr))  =  ^1  >      where  x.   Is  the  next  unused  number  of  the 
sequence  (lO).   For  computing  purposes,  this  relation  Is 
expressed  in  the  forms 

(11)  yo  =  F(^i)^ 
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where  F(x)   Is  the  Inverse  of  the  function  P(y).   The 
construction  of  each  chain  Is  continued  until  all  the  events 
occurring  up  to  a  given  time  ("census  time")  have  been 
determlnedo   Then,  the  next  chain  Is  constructed,  and  so  on. 

(in  practice,  the  calculations  for  different  chains 
are  often  Interspersed,  the  events  being  taken  In  more  or 
less  chronological  order,  regardless  of  which  chain  they 
belong  to  I  but  the  final  result  Is  the  same  as  that  described 
above,  except  for  an  Irrelevant  permutation  of  the  random 
numbers . ) 

When  the  descriptions  of  sufficiently  many  representative 
chains  have  been  prepared  in  this  way,  average  properties  of 
the  chain  reaction  can  be  obtained  by  statistical  analysis 
of  them.   Suppose,  for  example,  that  the  problem  Is  to  find 
the  average  value  of  the  total  kinetic  energy  E  of  all 
neutrons  in  the  system  at  the  end  of  a  specified  time  after 
the  introduction  of  the  initiating  neutron.   We  compute  the 
value  of  E  for  each  representative  chain  from  Its  niunerical 
description  and  average  the  result  for  all  the  chains. 

The  value  of  E   (or  of  any  quantity  whose  average 
value  Is  desired)  for  a  given  representative  chain  depends 
on  the  values  of  those  "random"  nximbers  of  the  sequence  (10) 
that  entered  into  its  descrlptlono   That  is,  we  can  write, 

(12)  E  =  f(x^,  x^^-j^,  x^_|_2,...)  , 
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for  some  1.   The  function  f  must,  in  general,  be  regarded 
as  a  function  of  Infinitely  many  variables,  because  one 
cannot  tell  In  advance  how  many  random  numbers  will  be 
required  for  the  description  of  a  given  chain  up  to  the 
end  of  the  specified  time  Interval,  unless  one  Is  willing 
to  adopt  some  ruse  such  as  assigning  a  minimum  time  between 
collisions  for  a  neutron;  for,  otherwise,  an  Indefinitely 
large  number  of  events  may.  In  principle,  take  place  In  the 
Interval.   If  It  turns  out,  for  a  given  chain,  when  the  values 
of  the  appropriate  x's  are  examined,  that  m  random  numbers 
suffice  for  determining  the  chain,  then,  for  those  particular 
values  of  the  first  m  arguments  in  (12),  the  function  f  is 
Independent  of  the  values  of  the  remaining,  infinitely  many, 
arguments . 

In  general,  the  function  f  is  very  complicated,  and 
cannot  be  easily  written  down,  but  in  any  case  it  is  clear 
that  the  objective  of  the  Monte  Carlo  calculation  is  an 
approximate  evaluation,  by  use  of  the  random  numbers,  of 
the  integral 

_    1     1 
(13)  E  =  /  dx,  /  dx^. . . »f (x, ,Xp,. . o) 

0     0 

over  the  unit  cube  in  a  space  of  infinitely  msmy  dimension!^. 
Each  chain  is  associated  with  a  point  in  this  unit  cube.   E 
is  the  limit  towards  which  the  average  of  E  must  tend  as  the 
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niimber  of  mathematical  representants  In  the  ensemble  tends 
to  infinity,  because  the  variables  In  (I3)  are  random 
numbers  uniformly  and  Independently  distributed  in  the 
interval    (0,1)-   The  representative  chains  of  Monte  Carlo 
are  associated  with  points  randomly  distributed  in  the 
unit  cube. 

In  practice  one  can  restrict  consideration  to  a  finite 
number  k  of  dimensions,  and  this  leads  to  an  integral  of 
type  (1),  mentioned  in  the  first  section,.   If  k  is  sufficiently 
large  (say  100  -  1000  in  typical  cases),  the  probability  that  the 
computing  machine  will  ask  for  more  than  k   "random"  nximbers, 
in  the  construction  of  a  single  chain,  is  so  remote  that  this 
possibility  can  be  ignored  ~   1.    e«  one  can  postpone  making 
any  decision  on  what  to  do  about  it  until  it  occurs. 

J>.    Theorems  on  Systematic  Sampling 0 

Ostrowski's  results'-  •■  concern  the  case  k  =  1,   so  that 

N 

f(fO) 


(14)    I  =  /f(x)dx  ,      h=^tZ(r.^ 

0  ^-y-ln) 


His  results  are  stated  for  the  function 


1  ,     p^  <  X  <  P2 

(15)   f(x)  = 

0  ,     otherwise 


wh 


ere  0  <  p^  <  Pp  <  1  J   but,  clearly,  they  also  hold  for 
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any  function  f (x)  of  bounded  variation.   His  main 
conclusions  are  twofold: 

1.  If   ^  Is  a  quadratic  Irrational,  or  any  Irrational 
whose  continued  fraction  expansion 

P  =  a^  +  -    —    ...   (a^,a-,  , .  .  .   are  positive  Integers) 
■^    0   a-j^  +  ag  +      ^  0'  1'     ^ = 

has  bounded  partial  denominators,  1.  e. ,  satisfies 

aj<A 

for  some  fixed  A,   then 

(16)  Un  "  ■'•I  ^  constant  -jj-  . 

If  f(x)   Is  as  defined  In  (15) >  the  constant  l£  <  36A. 

2.  If   «  Is  sin  Irrational  number  whose  continued 
fraction  expansion  has  partial  denominators  Increasing 
sufficiently  slowly  so  that  q^^-^   =  0(q^^) ,   for  fixed  r  >  1  , 
where  the  successive  best  rational  approximations  to   ^  are 
p,/q,  ,   that  Is,  where 

q^  =  1  ,  q^  =  a-^   ,  q^^^  =  a^^^q^  +  q._^   , 
then, 

(17)  i^  -   I  =  (3{^) 
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This  second  conclusion  of  Ostrowskl  can  be  combined 
with  the  conclusion  of  Roth's  theorem '•-■^■'  establishing  the 
truth  of  the  Thue-Slegel  conjecture  that.  If   ^  Is  any 
algebraic  number  and  ^  >  0,   then 

q 

for  all  Integer  pairs   (p,q),   with  at  most  a  finite  number 
of  exceptions.   Since,  always. 


Pll.    1 


i  '  IT'  < 


Ql'   Qi  q^+i 


l+€, 


It  follows  that  q-,-1  =  Oi^A        )p      hence,  _if  i     Is  any 
algebraic  number  and  €  >  o,   then 

(18)  l^N  -  ^1  =   ^(^)  ' 

where   I„  and  I  are  given  by  (l4). 

The  theorem  stated  by  the  writer  In  [j>]   is  the  following! 
if   ^  , ,  .  .  ,  ^   are  linearly  independent  irrationals  in  a  real 
algebraic  number  field  of  degree  n   (where,  necessarily, 
n  >k+l),   and  If  the  function  f(x, ,o,.,  x,)   is  such  that 
the  coefficients  of  its  Fourier  series 

00.  2irl(q^x^+-  =  "+qj^Xj^) 

f(x^,.»,,  x^)    =y-  a(q^,.,.,q^)e 

t^^^l'^-^k^ 
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satisfy   the   Inequality 

in-1 


(19)      \  fMax|q.U         1  a(q-,  , .  .  .  ,q,  )  |    <co       , 

then, 

Ij^  -    I  =      (5(|)      as      N  ^  oo       . 

The  proof  given  In  [J>]   will  be  sketched:  It  was  based 
on  a  lemma  similar  to  Llonvllle's  theorem: 

Lemma;   If   ^,  , .  .  .  ,  ^,   are  linearly  Independent 
Irrationals  In  a  real  algebraic  field  of  degree  n(>  k+l), 
there  Is  a  constant   C  >  0  such  that  If  q.  ,  •  • « ,  q^^,  p 
are  any  (rational)  Integers,  not  all  zero. 


(20)   kill  +•••+  q^^ej,  -  p1  >  ,   ,n-i 

(J)  '^ 

Proof.   (Compare  Cassels'-  •' ,  p.  79)-   It  is  sufficient 
to  prove  the  lemma  In  the  case  in  which  the   ^,,...,  ^^     are 
algebraic  integers:  for,  in  the  general  case,  (20)  can  be 
multiplied  through  by  a  common  denominator  of  the   ^'s   (then, 
only  integers  appear  on  the  left)|  if  the  resulting  inequality 
holds,  then,  (20)  also  holds  as  written.   Therefore,  suppose 
the   4's  are  algebraic  Integers. 

Denote  the  conjugates  of   |.   by  ^).      , .  -  -  ,i\^,      where 

J       J        J 

ii   =   i\      >    ^   =   l,2,...,ko   Construct  the  algebraic  integer 
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and  note  that  ir  and  its  conjugates  are  all  /  0.   Denote 
its  conjugates  by  r^    ',,..,   ir^^  ,   where  ir  =  ir^  '.   The 
product  of  the  conjugates, 

is  the  norm  of  tt  and  is,  therefore,  a  rational  Integer. 
Now,   N[Tr]  f^   0|  consequently 

|N[Tr]|  >  1 
or 

(21)         k^^^l  > 


l;r(2)i.|Tr(^)| k^")  j 


Now  restrict  the  discussion  to  those  sets  of  values  of  the 
rational  integers  q  ,...,  q  ,  p  for  which 

(22)  lir^^^l  =  |q-^^^  +•••+  q^^^^^  -  p|  <  |  ; 

then, 

(23)  k("^)|  =  k(^)  +y-q^(^M  .  ^(1))| 


j=l 


and,  by  the  triangle  inequality. 


J=l 
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Call 


(24)  A  =1  +  Max  {A"^^    -    A^^ 


Note  that  not  only  must  one  of  the  q,,...,  q,  ,  p  be  different 
from  zero,  but  one  of  the  q, , . . . ,  q,   must  be  different  from 
zero,  because,  otherwise,  the  Inequality  (22)  could  not  hold. 

Therefore,  \        |q.|  >  1;   consequently,  from  {2^), 

J 

(25)       lir^"'^    <     ^   lqjl(i+    Uj""^    -    ^^-^^1)    <   A^    kjl    . 
Note   that,    by  definition    (24),    A     is   independent  of   the 

k'^'i  1  — V — 


So  far  we  have  proved  that  either  this  Inequality  holds  or 

.(l)l   .  1 


IT' 


>  2-  •   Therefore,  calling 


°"^"'j;^'V- 


we  have  the  desired  result: 


.(^^1  > 


Max  |q  1^-1 
(J)    -^ 
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The  proof  of  the  theorem  immediately  preceding  the  lemma  is 
then  as  follows  s  For  any  x,   exp  2iri^xj=  exp  27rix  j 
consequently, 


1  N 

(26)  ij^  =   YZ   ^K-°-  ^k^  nXI 


2irin(q-^^^+.  °  "+qj^^i^) 


^l--^k 


— (n) 


Prom  the  formula  for  the  Fourier  coefficients,  clearly 
a(0,0,o..,0)  =  lo   The  factor  multiplying  a(0,.oo,0)  in  (26) 
is  unityi  therefore,  we  can  form  I^^  -  I  from  (26)  by  merely 
omitting  the  term  with  q-,  =  qp  =  .  .  .  =  q,  =  0  .   This  omission 
will  be  denoted  by  a  prime  on  the  summation  sign.   That  is 


^N  -  I  -  K 


q-i  f '  '  f  qi. 


(n  n    ^      2TiN(q-^^^+=»»+q^^j^) 

=2iri(q,^,  +  -»+q,  ^,  ) 
1  -  e 


or,  by  the  triangle  inequality. 


(27)   ll^  -  I|  <^ 


a(q-|^. 


q^ , . . , q^ 


,ci^) 


sin  TrN(q-^^-^+-"+qj^^j^) 


sm  r{q^^^+-''+q^i^) 


For  any  real  x. 


I  sin  ttNxI  <  1, 


I  sin  TFxl  .>  2|x-<x>|  , 


where  <x>  denotes  the  integer  nearest  to  x|   and  therefore,  finally 


-  18 


1  , .       |a(q-,,...  ,q,  ) 

- 1\  <^.  YZ 


^1 '  •  •  •  ^k 

where  p  denotes  the  Integer  nearest  to  ^i  ^i  +  ' '  •+'lv^i<-  >      °r 
by  the  lemma, 

1       '    ^      ^  ^''^ 
I^N"  ^'  ^  2CN   XI    i^'^'^j')    la(q-^,...,qj^)|  ; 

q-l  >  •  •  •  '^Vr 

hence,  by  (19),   Ij^  -  I  =   "^(m)  >      as  stated. 

The  practical  applicability  of  the  theorem  is  limited 
by  the  fact  that  the  inequality  (19)  imposes  drastic  restrictions 
on  the  integrand  f  (x.,  , .  .  .  ,x,  )  :  f  and  all  its  partial  derivatives 
up  to  order  n  -  1  must  be  smooth  (possess  absolutely  convergent 
Fourier  series).   Still,  the  theorem  indicates  that  the  sequence 
of  points  generated  in  the  unit  cube  by  (6)  is  more  uniform, 
in  some  sense,  than  a  random  sequence  would  be.   One  seems 
therefore  justified  in  expecting  that,  even  with  not-so-smooth 
integrands,  the  error  I^^  -  I  may  decrease  more  rapidly  than 
(^(I/v/N)  ,   although  perhaps  not  as  rapidly  as   d(l/N). 

The  result  was  greatly  improved  by  L.  G.  Peck,  who  proved 

[41 
the  following  theorem  ■'  %     _If   ^  , .  ,  .  ,  |   are  in  a  real  algebraic 

field  of  degree  k  +  1  ,   and  if  the  Fourier  coefficients  of 

f  (x^, .  .  .  ,Xj^)   satisfy  the  inequality 

„8)   |.„,,...,V1  <  ^^^ 

(j)    ^ 
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for  some  C  <  oo   and  some  ^   >  0   ,      then  I..  -  I  =  0{-^)  • 

To  compare  the  two  theorems,  we  may  take  the  case 
n  =  k  +  1  In  (19)°   In  order  that  an  Inequality  of  the  form 
(19)  Imply  (28),  there  Is  required  only  an  Infinitesimal 
Increase  In  the  exponent  of  (19) I  whereas,  In  order  that  an 
inequality  of  the  form  (28)  imply  (19)*  there  is  required 
an  increase  by  k  -  1  of  the  exponent  of  (28).   Therefore, 

(28)  covers  essentially  a  wider  class  of  functions  than  (19)* 
unfortunately,  functions  having  a  discontinuity  of  value  or 
of  low  order  partial  derivatives  are  still  not  covered,  when 
k  is  large. 

4.   Statistical  Treatment. 

With   I  and  Ij.  defined  by  (l),  (2),  (6),   we  now 
regard   |-,  , .  ,  . ,  ^   as  random  variables,  uniformly  and 

independently  distributed  between  0  and  1  ,   and  we  consider 

the  distribution  of  the  values  of  !„  ,   for  fixed   N  and  a 

fixed  function  f(x, ,..»,  ^w) -      We  shall  prove;   if  the 
Fourier  series 

oo_  2Tri(q  X  +.-»+q  X  ) 

f(x^,o,.,  Xj^)  =  y^  a(q^,...,  qj^)e 

^^^l-"-V 

is  absolutely  convergent 1  that  is,  if 

00 

(29)  y~  ia(q^,.o.,qj^)|  =  B  <  00   , 
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then, 

(50)  <1^>   =  I  , 


(31)  <|In  -  I|>  <  i±f^B   , 


where  <>  denotes  expectation  value. 

To  prove  (30),  note  that  the  expectation  value  of 
each  term  of  (26)  vaxilshes,  except  for  the  terms  for  which 
q  =  ...  =  q  =  0  J   sind  these  terms  give  (50)  because 
a(0,...,0)  =  I.   The  terra-wise  Integration  Is  Justified  by 
the  absolute  convergence. 

To  prove  (31),  we  start  from  {27),    each  side  of  which 
Is  to  be  Integrated  over  the  k-dlmenslonal  unit  cube 
0<4-,<l,...,0<^,  <1.   Because  of  the  absolute 
convergence,  and  because  of  the  posltlvlty,  we  can  equally 
well  Integrate  any  given  term  of  the  series  over  any  other 
region  (say,  rectangular),  of  unit  volume,  that  Includes  a 
complete  period  of  the  variation.   For  given  values  of 
q.,,...,  q,  ,   consider  the  slab 


- 1  <  ^1^1  +•••+  ^k^k  <i 


In  the  k-dlmenslonal  space  ^  , .  .  .  ,  ^  .  If  we  take  an 
appropriately  oriented  cartesian  coordinate  system,  the 
corresponding  term  In  (27)  depends  only  on  the  coordinate 

-  21  - 


normal  to  the  faces  of  this  slab^  furthermore,  the  thickness 
of  the  slab  is  equal  to  the  period «   We  may  therefore 
integrate  over  a  rectangular  region  consisting  of  a  piece 
of  this  slab  having  cross-section  area  equal  to  the  reciprocal 
of  its  thickness.   Denoting  this  region  by  P  ,   we  have 

I  sin  TN(q^|^+-«+qj^^j^) 
^'"l        Isln  IT   (q^4^+..»+q^^^)   ^^l"""  ^^k 


V2 
/ 

-  V2 


sin  TNy 


sin  iry 


dy 


To  estimate  this,  we  split  the  integral  into  2  parts s  for 

|y|  <  gi  ,   we  use  the  bound   jfj^-^l  <  N  J  for  ^  <  |yj  <  1  , 

we  use   I  sin  irNyl  <  1  and   |sin  iryl  >  2|yj  i   the  above  integral 

is  then  less  than 

_1         1 
2N  2  . 

J      Ndy  +  2   f  -^  =  1  +  inN  . 

_1      _i 

2N         2N 

therefore,   <|In  "  l|>  <  '^'^^     B  ,   as  stated. 

To  make  use  of  this  result,  one  would  presumably  msike 

a  number,  say,   M  ,   of  random  choices  of  the  vector   ( ^, , . . . ,  |  )| 

for  each  of  these  choices,  one  would  generate   N  chains  of  events, 

as  described  in  Section  1,  using  Weyl's  sequence  (6)  (there  are 

then  NM  chains  in  all).   One  would  then  have  M  values  of   I„  t 

N 
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their  average  would  be  taken  as  an  estimate  of   I  and 
their  dispersion  as  a  measure  of  the  probable  error.   The 
numerical  examples  in  Section  6  show  that  the  resulting 
mean  error  is  substantially  less  than  would  be  expected  if 
the  NM  chains  had  been  chosen  at  random. 

However,  the  foregoing  theorem  leaves  unanswered 
various  questions  about  the  distribution  of  the  values  of 
I„  ,   for  fixed  N  ,   as  the   ^, , . . . ,  ^   vary  randomly; 
there  is  no  indication  in  the  above  that  this  distribution 
is  even  approximately  normal,  even  for  large  N.   Therefore, 
in  particular,  the  dependence  of  the  standard  deviation 


6  =  v  <(I^  -  I)   >  on   N  is  an  open  question.   [If  the 
distribution  were  normal,  the  standard  deviation  would  be 


equal  to  J  "^       times  the  first  absolute  moment  <|lj,  -  l|> 

and  hence  it  would  also  vary  as   (5'( -^ ).]   This 

question  is  of  some  slight  practical  importance,  because 
the  probable  error  of  the  sample  mesin  (mean  of  the  values 
-of  Ij,)  depends,  according  to  the  central-limit  theorem,  on 
(5  rather  than  on  <|l»j  -  l|>   (d  exists  and,  in  fact,  the 
distribution  is  bounded,  so  that  all  its  moments  exist);  but 

*         r  1 

an  error  estimate  based  on  <|I^  -  I|>  is  probably  good 
enough  for  practical  purposes. 

*  i.e.,  using  the  formula  that  would  be  correct  if  the 

distribution  of   I„  were  normal. 

N 
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Comments . 


Some  mlscellaineous  comments  are  collected  In  this  section. 
a.  Number  of  digits  needed.   In  a  computing  machine 
the   ^i  , .  ,  . ,  ^,   are  approximated  by  rational  numbers  (binary 
or  decimal  fractions) i  the  question  arises  as  to  the  number 
of  digits  that  must  be  retained  in  order  that  the  error  due 
to  this  source  is  no  greater  than  the  Inherent  error  of  the 
method,  for  given  N.   The  answer  is,  roughly,  that  slightly 
more  than  2  log-,Q  N  decimal  places  or  2  log^  N  binary 
places  should  suffice .   To  see  this,  suppose  that  each  of 
the   ^-^  , ,  »  .  ,  ^,   is  in  error  by  an  amount  of  order   6|  , 
Then,  a  given  component  of  the  vector   (xj  ',..,,   5cA  '),    say 
x)^    ' ,      that  is,  \n^.i,      is  in  error  by  an  amount  of  order 
n6^  <  N5^  »   The  resulting  error  of  f(xj"',..  =  ,  y^-^')      is 
of  order  _<  KN64  ,   where 


K  =  ^^^(X3_,...,x^)  '^^(j) 


o^  1^2  ,  .  o  .  '^]^' 


<)Xj 


Each  of  the  k  components  of   (x-,,...,x,  )   can  contribute 

such  an  error,  and  these  errors  must  be  compounded.   However, 

f(x, ,...,x,  )   does  not  generally  depend  on  all  k  of  its 

arguments,  because  k  is  chosen,  as  stated  in  Section  1,  to 

be  much  larger  than  the  expected  number  of  "random  numbers" 

*  In  the  examples,   f(x, ,oo.,x,  )   is  continuous  and  has 
piecewise  continuous,  hence   bounded,  first  partial 
derivatives. 
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needed  to  construct  an  individual  chain.   We  may  define  an 
effective  number  of  dimensions,  depending  on  the  function 
f  (x-,  , .  .  .  ,x,  ) ,   as  follows:   as  the  coordinates  x,  , .  .  . 
of  a  given  point  in  the  cube  are  examined,  in  succession, 
in  the  construction  of  a  chain,  it  is  found  that  only  a 
certain  number  of  them,  say  x,  ,...,x,^  are  needed  to 
construct  that  chain.   In  the  neighborhood  of  this  point 
in  the  cube,   f(x  ,...,x,  )   is  thus  actually  Independent 
of  the  remaining  variables  ^i,*  . -i  >  •  •  •  >x,   (generally  k*  «  k) . 
The  effective  number  of  dimensions  k^  is  then  defined  as 
the  average  of  k*  over  the  unit  cube.   k„  is  of  order  10 
for  simple  Monte-Carlo  problems,  even  though  many  more  of 
the  variables  x.,  , .  .  . ,  x,   may  occasionally  be  needed  for 
the  construction  of  an  exceptional  chain.   Therefore,  the 
error  of  a  single  value  of  f  (x-,  , .  .  .  ,x,  )   is  of  order 
<  kj-j  KN5^  .   The  error  of  the  average  (2)  is  also  <  k^  KN5^  , 
even  if  no  allowance  is  made  for  possible  mutual  cancellation. 
On  the  other  hand,  the  inherent  error  of  the  method  can  hardly, 
in  any  case,  be  less  than  about  I^N   (see  next  paragraph), 
hence  we  want 

ko  N6e  <  i   , 
or,  if  the   4  ,...,  4   are  themselves  of  order  unity, 
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02) 


loSio  if  >  2  log-Lo  N  +  log^Q  kQ   , 


log2  57  >  2  logg  N  +  logg  ICq 


In  the  examples,  with  N  '^  10   ,  kQ  •~'  10  ,   it  follows  that 
nine  decimal  places  should  sufflcei  eleven  were  used. 

b.  Comparative  rates  of  convergence  of  various 
procedures.   In  any  method  in  which  there  is  a  sequence  of 
points  in  the  cube  and  the  integral  is  approximated  by  the 
average  (equation  (2))  the  error  is  certain  to  _be  >  &{-^)  > 
unless  there  is  some  rule  or  procedure  for  stopping  the 
sequence  at  those  exceptionally  fortunate  values  of  N  where 
the  error  is  smaller,  because  the  difference  between  successive 
approximations  1^     and  Im,  n   can  be  of  order  p"  ^  w  '   where 

-X- 

R  =  range  of  f(x,,...,x,  )  =  maximum  of  f  (x-,  , .  .  .  ,x,  ) 
-  minlmiom  of  f  (x,  , .  .  .  ,x,  ) .   In  any  method  of  this  kind  that 
gives  this  small  an  error,  one  may  say,  roughly,  that  errors 
introduced  early  in  the  sequence  are  eventually  almost 
completely  jorapensated,  leaving  a  residual  :hat  is  not 
appreciably  larger  than  the  error  of  the  last  term.   Hence, 
the  estimates  (l6),  (l8),  (51)  are  about  as  good  as  could 
be  hoped  for. 

*  In  the  preceding  section,  this  lower  bound  of  the  error  was 
used  I  there,  however,  we  wrote   K  -,  (maximum  magnitude  of 
the  l£^  derivatives)  instead  of  -^  B.°,    in  practical  cases, 
these  will  be  of  the  same  order  of  magnitude. 
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It  Is  interesting  to  compare  the  present  method  and 
the  Monte-Carlo  method,  regarded  as  methods  for  evaluating 
multi-dimensional  integrals,  with  methods  using  a  lattice 
of  points  -  S£iy,  the  trapezoid- rule  method  .   If  each  of 

the  k  edges  of  the  unit  cube  is  divided  into  m  intervals, 

k  ** 

there  will  be  m   points  in  the  lattice   .   The  trapezoid 


rule  gives  then  just  the  formula  (2),  but  with  N  restricted 
to  values  of  the  form  m   .   The  e; 
formula  is   (^(Ax)  =  0{-)    ;    i.e.. 


to  values  of  the  form  m   .   The  error  of  the  trapezoid  rule 


'm' 


In  -  I  -  <='(^)  ' 


compared  with   C5(1/v/n)   for  random  sampling.   In  three  or 
more  dimensions,  the  Monte- Carlo  method  Is  better  than  the 
trapezoid  rule,  at  least  for  large  N  .   By  means  of  more 
elaborate  quadrature  formulas,  using  a  lattice  one  can 
achieve   (5((Z^)  )   for  numerical  integration  of  smooth 
functions,  i.e. 


^N  -  ^  =  ^(^)   • 


*  For  periodic  functions,  the  trapezoid  rule  is  preferable 
to  Simpson's  rule. 

♦♦Corresponding  points  on  opposite  faces  of  the  cube  may  be 
identified,  because  f(xi,...,xk)  is  periodic?  then,  all 
xa^     points  have  equal  weight  in  the  numerical  integration. 
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still,  Monte  Carlo  is  better  for  large  N  ,   in  five  or 
more  dimensions.   The  method  of  systematic  sampling  is 
asymptotically  better  than  the  trapezoid  rule  already  in 
two  dimensions  and  better  than  second-order  quadrature 
formulas  in  three  dimensions. 

c.  Conjectured  result  for  algebraic  numbers.   It 

* 

is  conjectured  that,  for  reasonable  integrands,  if 

i-^'-'-'i^Q     are  algebraic,  and  if  «  >  0  ,   then. 


It  was  observed,  in  Section  3,  that  this  is  true  in  one 
dimension   (k=l),   by  combining  a  result  of  Ostrowski 
with  Roth's  theorem.   But  more  can  be  said:  it  has  been 
proved  that  this  is  always  true  for  sufficiently  smooth 
integrands  (for  example  analytic)  and  one  can  show  that 
the  introduction  of  certain  discontinuities  does  not  disturb 
it.   We  can,  in  fact,  add  to  the  smooth  function  any  number 
of  functions,  each  depending  only  on  some  linear  combination 
of  the  coordinates,  that  is,  of  the  form 


f  (x-^, .  .  .  ,x^)    =  g(a-^x-^+-  •  -H-aj^x^) 


*  If  the  reader  insists  on  a  more  definite  statement  of  the 
conjecture,  let  us  say  either  that  f  (xi, .  .  .  ,xi<:)   is 
piecewise  dif f erentiable  or  continuous  and  piecewise 
diff erentiable. 
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where  the  a^,...,a,   are  rational  coefficients,  and  where 
g(w)   Is  smooth  except,  say,  for  a  finite  number  of  simple 
discontinuities.   This  Is  easily  seen,  because.  If 
f(x, ,...,x,  )   Is  periodic,  we  can  Just  as  well  write 
x^"^  =  n^.   Instead  of  x^."^  =  fn^A   ,      so  that 

a^x-j^+.-+a^Xj^  =  n(a-^^-^+---+aj^^^)  j 

and  the  Integral  of  g  therefore  reduces  to  the  one-dlmenslonal 

case,   °^i^i+°  ° ''+^k^k  ^^i^^S  an  algebraic  number^ 

We  have  not  been  able  to  prove  or  disprove  the  above 
conjecture. 

d.  Special  choices  of  the  algebraic  Irrationals, 
If,  as  suggested  by  the  above  conjecture,  a  favorable 
limitation  of  the  error  can  be  obtained  using  a  single  set 
of  algebraic  nxirabers  for   ^-i  >•■■■■>  ^i,  >      the  question  arises 
how  to  choose  these  numbers  to  give  the  best  results. 
J.  von  Neiimann  suggested  to  the  writer,  many  years  ago,  that 
one  should  try  to  obtain  results  based  on  a  particular  choice, 
rather  than  general  theorems  of  the  kind  given  in  Section  ^, 
For  Instance,  the  reasoning  In  Section  3  suggests  that  the 
4-,,...,^   should  be  so  chosen  that  the  dlophantine  inequalities 

*  they  could  also  be  algebraic. 
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are  as  strong  as  possible.   Since,  for  the  best  approximants, 
according  to  the  theory  of  continued  fractions 


Pi 

^-7r\  < 


^1  -  ^1  ^1+1 


It  seems  that  the  ^'s     should  be  so  chosen  that  the  q. 

Increase  as  slowly  as  possible.   Therefore,  the  partial 

denominators  of  the  continued-fraction  expamsions  should 

be  as  small  as  possible.   Some  numerical  experimentation 

has  been  performed,  with   Ct  *  •  °  '  »  ^;<-  ^o  chosen  that  their 

continued-fraction  expansions  have  only  I's  and  2's 

as  partial  denominators.   The  results  did  not  seem  to  be 

noticeably  better  than  with  other  choices  that  have  been 

tried.   The  reason  may  be  that,  even  if   ^-,   and   ^„  are 

two  such  numbers,  there  is  no  apparent  reason  for  supposing 

that   ^-j  +  4p  ,   or  other  simple  linear  combinations,  will 

also  have  small  partial  denominators. 

5-  An  alternative  to   I„  as  estimator  of  I.   In 

N 

equation  (2),  which  expresses   I„  as  an  average  of  the 
values  of  the  function  at  N  points  in  the  cube,  all  N 
points  are  given  equal  weight.   In  random  sampling,  this 
is  appropriate,  because  the  points  are  statistically 
independent,  hence  uncorrelated.   In  systematic  sampling, 
there  are  correlations,  in  fact,  long-range  ones,  among 
the  points  of  the  sequence j  conceivable,  unequal  weighting 
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might  be  better.   The  following  heuristic  argiiments  lead 
to  a  set  of  unequal  weights. 

If  the  successive  values  of 

are  plotted,  on  a  graph,  against  m  ,   the  points  on  the 
graph  approximate  a  straight  line  whose  asymptotic  slope 
is  equal  to   I.   The  problem  is  to  estimate  the  slope 
from  a  finite  number  of  points. 

In  random  sampling,  the  successive  increments 
S  .  -  S   in  the  graph  have  equal  probable  errors,  and 
are  uncorrelated;  in  this  case,  the  best  estimate  of  the 
slope  is  obtained  by  using  only  the  first  and  the  last 
points  of  the  graph:  this  gives  equal  weight  to  all  the 
Increments . 

The  estimates  of  Sections  3  and  4  give   1^,-1  much 
closer  to   (^(m)   than  to  d{ — )  ;   and  this  suggests  that, 
for  systematic  sampling,  it  is  the  actual  coordinates  of 
the  points  in  the  graph,  rather  than  the  increments,  that 
have  equal  probable  error.   If  this  is  so,  one  should  get 
a  better  estimate  of  the  slope  of  the  graph  by  a  least- 
squares  fitting.   This  leads  to  the  formula 

(34)  i^=^y-    ^^li^tti^  f(x(-),...,x(-)) 

N   NZ__(^)  (n+1)(N+2)     ^        ^ 
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in  place  of  1^^      One  can  still  prove  that 

(although  the  proof  Is  harder)  when  the  ^i>'''>iir     ^^® 
treated  statistically,  as  in  Section  4,  if  f (x, , « . . ,x,  ) 
has  sui  absolutely  convergent  Fourier  series.   In  the 
numerical  examples,  the  improvement  in  going  from  I„  to 
Ij^  was  disappointingly  small. 

6.  Numerical  Examples, 

Eiy  way  of  test,  the  method  was  applied  to  the  following 
simple  multiplicative  problem.   At  time  t  =  0  ,   a  single 
particle  is  presenti  at  any  instant,  any  particle  can  divide 
into  two  particles  I  the  probability  of  division  of  a  given 
particle  in  an  interval  At  is  taken  as  constant  XAt  j 
and,  in  fact,  the  constant  is  set  equal  to  1  by  suitable 
choice  of  the  unit  of  timei  no  other  processes  are  considered; 
the  only  quantity  of  interest  is  the  number  m(t)   of  particles 
present,  as  function  of  the  time  t  .   This  problem  can  be 
treated  by  elementary  methods,  according  to  which 


<m('t)>  -  e^ 


<(m(t)  -  e^)2>  =  e^^  -  e^   i 
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and.  In  fact,  if  P^Ct)   is  the  probability  of  having  Just 
m  particles  present,  at  time  t  ,    then  Pj-C^)   is  the 
coefficient  of  x   in  the  expansion  of  the  generating 
function 


x+(l-x)e^ 

We  shall  construct  each  chain,  up  to  a  time  t  =  T  , 
called  the  census  time,  and  we  wish  to  estimate  the  expected 
number  <m(T)>  of  particles  present  at  time  t  =  T  . 

When  the  random  numbers  x.,  ,  Xp,...,  x,  ,   i.e.  the 
coordinates  of  a  point  in  the  k-dimenslonal  cube,  are 
available,  the  construction  of  a  chain  proceeds  as  follows. 
We  adopt  the  convention  (purely  for  purposes  of  terminology) 
of  saying  that,  when  a  particle  divides,  it  dies  or  disappears 
and  two  new  particles  are  born.   We  keep  a  running  list  (stored 
in  the  machine)  of  the  times  of  birth  of  all  particles  whose 
times  of  death  we  have  not  yet  ascertained.   At  the  beginning 
of  the  calculation  of  this  chain,  the  list  contains  just  one 
entry,  namely  the  time  t  =  0  for  the  particle  present  at 

the  beginning.   We  examine  the  entries  of  the  list,  one  by 

* 

one,  and  ascertain  the  time  of  death  by  adding   -  inx  to 

*  In  the  actual  machine  work,  we  store  e    throughout, 
-  instead  of  t  ,   to  avoid  the  calculation  of  logarithms; 
we  then  multiply  e~^  by  x  . 
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the  time  of  birth,  where  x   Is  the  next  unused  random 
number  of  the  sequence  x-,  , .  .  .  ,x,  .   Thus  the  probability 
of  the  particle's  survival  to  time  t  Is  exp  -(t-tlme  of  birth) 
If  this  time  of  death  Is  >  T  we  simply  strike  the  particle 
off  the  list  J  If  It  Is  <  T  we  also  strike  It  off,  but  we 
then  add  two  new  particles  to  the  list,  with  birth  times 
equal  to  the  time  of  death  of  the  old  particle.   This  procedure 
is  continued  until  the  list  is  exhausted;  the  description  of 
this  particular  chain  is  then  complete.   The  crude,  simple, 
way  to  estimate  <m(T)>  would  be  to  take  f(x, ,...,x,  )   equal 
to  the  actual  number  of  particles  m(T)   present  in  the  chain 
at  time  t  =  T  .   This  is  equal  to  the  total  number  of  particles 
struck  off  the  list  during  the  procedure  just  described,  and 
Ccin  be  counted  by  the  machine  during  the  construction  of  the 
chain.   In  this  case,   f (x, , . . , ,x,  )   Is  a  step  function  with 
Integer  values. 

To  find  a  smoother  function  of  the  x-,,...,x,   as 

1     '  k 

estimator  of  <ra(T)>  ,  we  proceed  as  follows.  If  d(t)  is 
the  number  of  divisions  that  have  occurred,  up  to  time  t  , 
then,  clearly, 

ra(t)  =  1  +  d(t)  , 
<m(T)>  =  1  +  <d(T)>   , 

because  each  division  increases  the  number  of  surviving 
particles  by  1.   We  estimate  d(T)   by  looking  at  the  birth 
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times  (including  t  =  0   for  the  original  particle)  and 
computing,  for  each  particle,  the  probability  that  a  particle 
born  at  that  time  will  die  (i.e.  divide)  before  t  =  T.   The 
sum  of  these  probabilities  is 

^_        -(T-t  ) 
(55)  ci(T)  =\-^    ^    [1-e     ^  ]    , 

where  the  sum  is  over  all  the  particles  in  the  chain,  and 
t.   is  the  time  of  birth  of  the  i—  particle.   Clearly, 
d(T)   is  a  continuous,  piecewise  diff erentiable  function 
of  x^ , .  .  .  ,  X,  ;   amd 

<d(T)>  =  <d(T)>   . 

f(x,,...,x  )   is  taken  as  1  +  d(T).   This  choice  of 
f(x, ,...,x,  )   is  an  example  of  a  technique,  called  the  use 
of  expected  values  by  Kahn'-'S  for  reducing  the  variance  m 
Monte  Carlo  calculations. 

The  actual  construction  of  the  chains  is  the  same  as 
described  above  except  for  one  modification:  in  order  that 
f(x, ,...,x,  )  have  an  absolutely  convergent  Fourier  series, 
it  does  not  suffice  that  it  merely  be  continuous  and  piecewise 
diff erentiable  in  the  unit  cube;  it  should  also  have  equal 
values  at  corresponding  points  on  opposite  faces  of  the  cube, 
so  that  the  periodic  function  given  by  the  Fourier  series  is 
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continuous  in  the  entire  k-dimensional  space.   To  achieve 
this  is,  however,  very  easy:  whenever  a  random  number  is 
needed,  one  uses   |2x-l|   Instead  of  x  Itself,  where 
X  is  the  next  unused  member  of  the  sequence  x-,  , .  .  .  ,x,  . 
|2x-l|   is  also  uniformly  distributed  in   (0,l),   if  x 
is  I  and,  by  this  trick,   f (x, , . . . ,x,  )   is  symmetrized  about 
the  mid-planes  of  the  cube. 

In  order  to  compare  our  results  with  those  of  random 
sampling  (Monte  Carlo),  we  must  have  an  estimate  of  the 
probable  error  of  the  random  sampling,  using  the  same 
function  f  (x-,  , .  .  .  ,x,  )  .   (if  we  use  the  expected- values 
technique  described  above,  it  is  only  fair  to  permit  the 
same  in  the  Monte  Carlo  method,  whereby  its  error  is  somewhat 
reduced.)   This  is  given  by 


p.e.  =  0.6745 


6 


yN 


where 


(36)  6   =     y<f2>  -  <f>2    . 

2 
The  expectation  value  <f  >  is  rather  difficult  to  compute 

analytically,  in  the  present  problem,  but  an  adequate  (in  fact, 

better- than- adequate)  estimate  can  be  obtained,  concurrently 

with  that  of  \fy   ,   by  the  computing  machine,  using  the 

same  method  (i.e.,  the  systematic  sampling).   This  we  have 
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done,  in  the  examples. 

In  a  first  series  of  tests,  the  ^i*-'->iir     were  taken 
as  the  fractional  parts  of  the  numbers 

\/2 ,  \/3,  V^,  V5,  Vio,  Vi5,  \/5o,  vT,  •  •  • . 

The  radlcands  are  square-free  Integers,  taken  in  an  order 
that  is  easy  for  the  machine  to  generate.   It  is  known  that 
the  above  are  linearly  Independent  irrationals. 

Figure  1  shows  values  of  the  error   Ij^  -  I  for  several 

« 

problems.   In  each  problem,  there  were  N  =  21,000  chains  . 

The  problems  have  different  values  of  the  census  time  T  ; 

T 
the  abscissa  is  the  quantity   e   ,   which,  as  noted  above, 

is  the  expected  niunber  of  particles  at  t  =  T  and  is  equal 

to   I  .   The  curves  give  the  probable  error  of  the  corresponding 

calculation  by  random  sampling;  on  a  statistical  basis,  half  the 

points  should  lie  between  the  two  curves,  and  half  outside. 

It  is  seen  that  the  error  is  reduced,  relative  to  random 

sampling,  by  a  factor  that  ranges  from  about  1/2  to  about  1/35 

(the  last  case  Is  probably  fortuitously  good).  1^   comparison, 

the  statistical  treatment  of  Section  4  (there  is,  of  course, 

no  reason  why  it  should  apply  here,  since  a  single  choice  of 

the  numbers  i-.,...,L       has  been  made  here)  would  suggest  an 

*  This  number  was  determined  by  certain  limitations  of  tlie 
computation  program  that  are  probably  of  no  Interest  to 
the  reader. 
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improvement  by  a  factor  of  the  order   ((l+inN)/iNl)/\/N»  I/I3  . 

Figure  2  shows  the  results  of  some  calculations  by  the 

1  T 

statistical  method  of  Section  4,  all  for  the  case  e  =  2  . 

Five  values  of   N  were  taken:   N  =  1000,  2000,  4000,  10,000, 

21,000  .   Each  point  represents  the  results  of  37  different 

calculations   (M  =  37),   with  ^i'---'^i^     obtained  from  a 

random  number  generator  that  has  been  used  and  tested  in 

Monte  Carlo  work.   What  is  plotted  is  the  average  of  the 

57  values  of   jlj^  -  l|   thus  obtained   (I  =  2).   Although 

M  =  37  represents  a  rather  small  sample,  this  average  should 

not  differ  too  much  from  the  expectation  value  <)l^  -  l|> 

discussed  in  the  theorem.   The  data  are  too  meager  to  tell 

whether  a   (l+inN)/N  law  is  followed;  and,  besides,  the 

theorem  gives  only  an  inequality;  furthermore,  the  value 

of  constant  in  the  theorem  is  not  known.   However,  if 

random  sampling  had  been  used,  and  if  the  sample  size 

(i.e.  number  of  calculations,  M  =  37)  is  not  too  small,  the 

points  would  lie  on  the  curve  shown;  hence,  an   Improvement 

over  random  sampling  is  again  apparent. 

Figure  3  shows  similar  results  for  the  case   e'^  =  4  . 

The  Improvement  over  random  sampling  is  somewhat  less  than 

T 
for  e   =  2  . 

It  is  not  known  whether  an  improvement  of  this  sort 

can  be  expected  in  practical  Monte  Carlo  problems.   It  should 

perhaps  be  pointed  out  that  the  incorporation  of  systematic 
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sampling  In  a  Monte  Carlo  code  is  not  at  ail  difficult; 
tne   increase  (if  any)  in  computing  time  is  negligible; 
and  the  increase  in  required  storage  is  not  severe  -  a 
few  hundred  memory  locations  are  needed  for  the   ^i'*--'^i<- 
and,  perhaps,  for  the  current  values  of  xj'^' , .  .  .  ,Xj^'^' . 
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